A Randomized Algorithm for Learning Mahalanobis Metrics: Application to Classification and Regression of Biological Data
Abstract
We present a randomized algorithm for semi-supervised learning of Mahalanobis metrics over R^n. The inputs to the algorithm are a set U of unlabeled points in R^n; a set of pairs of points, S = {(x, y)_i}, x, y ∈ U, that are known to be similar; and a set of pairs of points, D = {(x, y)_i}, x, y ∈ U, that are known to be dissimilar. The algorithm randomly samples S, D, and m-dimensional subspaces of R^n and learns a metric for each subspace. The metric over R^n is a linear combination of the subspace metrics. The randomization addresses issues of efficiency and overfitting. Extensions of the algorithm to learning non-linear metrics via kernels, and to use as a pre-processing step for dimensionality reduction, are discussed. The new method is demonstrated on a regression problem (structure-based chemical shift prediction) and a classification problem (predicting clinical outcomes for immunomodulatory strategies for treating severe sepsis).
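The sampling-and-combination scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how each subspace metric is learned or how the linear combination is weighted, so the per-subspace learner (inverse regularized covariance of similar-pair differences) and the weights (ratio of mean dissimilar-pair to mean similar-pair distance) are hypothetical stand-ins, as are all function and parameter names. Pairs are given as index tuples into the rows of `X`.

```python
import numpy as np


def learn_subspace_metric(Z, S_idx, reg=1e-3):
    """Stand-in learner (assumption, not from the paper): inverse of the
    regularized covariance of similar-pair differences, so directions in
    which similar points differ little receive large weight."""
    diffs = np.array([Z[i] - Z[j] for i, j in S_idx])
    cov = diffs.T @ diffs / len(diffs) + reg * np.eye(Z.shape[1])
    return np.linalg.inv(cov)


def mean_sq_dist(M, Z, pairs):
    """Mean squared Mahalanobis distance d^T M d over index pairs."""
    return float(np.mean([(Z[i] - Z[j]) @ M @ (Z[i] - Z[j]) for i, j in pairs]))


def sample_pairs(rng, pairs, frac):
    """Random subset of the given pairs, without replacement."""
    size = max(1, int(frac * len(pairs)))
    return [pairs[k] for k in rng.choice(len(pairs), size, replace=False)]


def randomized_mahalanobis(X, S, D, m, n_rounds=20, pair_frac=0.5, seed=0):
    """Combine metrics learned on random m-dimensional subspaces of R^n."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    A = np.zeros((n, n))
    total_weight = 0.0
    for _ in range(n_rounds):
        # Random m-dimensional subspace: orthonormal basis from a reduced QR.
        Q, _r = np.linalg.qr(rng.standard_normal((n, m)))
        S_sub = sample_pairs(rng, S, pair_frac)
        D_sub = sample_pairs(rng, D, pair_frac)
        Z = X @ Q  # coordinates of the points in the subspace
        M = learn_subspace_metric(Z, S_sub)
        # Hypothetical weighting: favor subspaces that separate D from S.
        w = mean_sq_dist(M, Z, D_sub) / (mean_sq_dist(M, Z, S_sub) + 1e-12)
        A += w * (Q @ M @ Q.T)  # lift the m x m metric back to n x n
        total_weight += w
    A /= max(total_weight, 1e-12)
    return (A + A.T) / 2  # symmetrize against round-off


def mahalanobis_sq(A, x, y):
    """Squared distance (x - y)^T A (x - y) under the learned metric."""
    d = x - y
    return float(d @ A @ d)
```

Because every subspace metric is positive definite and every weight is non-negative, the combined matrix is positive semidefinite, so `mahalanobis_sq` returns a valid squared (pseudo-)distance. Each round only inverts an m x m matrix, which is where the efficiency benefit of working in low-dimensional subspaces would come from in a scheme like this.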
Related resources
Region Directed Diffusion in Sensor Network Using Learning Automata: RDDLA
One of the main challenges in wireless sensor network is energy problem and life cycle of nodes in networks. Several methods can be used for increasing life cycle of nodes. One of these methods is load balancing in nodes while transmitting data from source to destination. Directed diffusion algorithm is one of declared methods in wireless sensor networks which is data-oriented algorithm. Direct...
Stock Price Prediction using Machine Learning and Swarm Intelligence
Background and Objectives: Stock price prediction has become one of the interesting and also challenging topics for researchers in the past few years. Due to the non-linear nature of the time-series data of the stock prices, mathematical modeling approaches usually fail to yield acceptable results. Therefore, machine learning methods can be a promising solution to this problem. Methods: In this...
Determination of Best Supervised Classification Algorithm for Land Use Maps using Satellite Images (Case Study: Baft, Kerman Province, Iran)
According to the fundamental goal of remote sensing technology, the image classification of desired sensors can be introduced as the most important part of satellite image interpretation. There exist various algorithms in relation to the supervised land use classification that the most pertinent one should be determined. Therefore, this study has been conducted to determine the best and most su...
SFLA Based Gene Selection Approach for Improving Cancer Classification Accuracy
In this paper, we propose a new gene selection algorithm based on Shuffled Frog Leaping Algorithm that is called SFLA-FS. The proposed algorithm is used for improving cancer classification accuracy. Most of the biological datasets such as cancer datasets have a large number of genes and few samples. However, most of these genes are not usable in some tasks for example in cancer classification....
Publication date: 2006